TidyModels is the successor to Max Kuhn’s CARET and can be used for a wide range of machine learning tasks. This modelling framework takes a different approach to modelling, allowing for a more structured workflow, and, like the tidyverse, it provides a whole set of packages for making the machine learning process easier. I will touch on a number of these packages in the following subsections.
This framework supersedes the modelling content in R for Data Science, as Hadley Wickham admitted he needed a better modelling solution at the time, and Max Kuhn and his team have delivered on this.
The aim of this webinar is to:
The framework of a TidyModels approach flows as follows:
I will show you the steps in the following tutorials.
I will load in the stranded patient data. A stranded patient is a patient who has been in hospital for longer than 7 days; we also call these patients Long Waiters. The import steps are below and use the readr package to load the data in:
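A minimal import sketch is shown below. The file name and the object name `strand_pat` are assumptions; point the path at your own copy of the stranded data.

```r
library(readr)
library(dplyr)

# File name is an assumption - substitute the path to your copy of the data
strand_pat <- read_csv("stranded_data.csv")

# Quick sanity check of what has been loaded
glimpse(strand_pat)
```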
As this is a classification problem, we need to look at the class imbalance in the outcome variable, i.e. the thing we are trying to predict.
The following code looks at the class balance as a volume and as a proportion. I then use the second index from the class balance table, i.e. the minority class: the number of people who are long waiters should be lower than the number who are not, otherwise we are offering a very poor service to patients.
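A base R sketch of this check, assuming the outcome column is called `result_code`:

```r
# Class balance as a proportion and as raw counts
prop.table(table(strand_pat$result_code))
table(strand_pat$result_code)

# The second index is the minority (long waiter) class
prop.table(table(strand_pat$result_code))[2]
```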
##
## 0 1
## 0.7526316 0.2473684
##
## 0 1
## 286 94
It is always a good idea to observe the structure of the variables we are working with. I generally separate the names of the variables out into factors, integer/numeric columns and character vectors:
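One way to do this split in base R is a sketch like the following (the variable names are assumptions):

```r
# Separate the column names out by their data type
factor_vars    <- names(strand_pat)[sapply(strand_pat, is.factor)]
numeric_vars   <- names(strand_pat)[sapply(strand_pat, is.numeric)]
character_vars <- names(strand_pat)[sapply(strand_pat, is.character)]

print(factor_vars)
print(numeric_vars)
print(character_vars)
```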
## character(0)
## [1] "anb_angle" "bjork_sum" "overbite"
## [4] "max_ci_to_sn" "max_ci_to_op" "impa"
## [7] "mand_ci_to_op" "interincisal_angle" "ul_to_eline"
## [10] "ll_to_eline" "nasolabial_angle" "archlen_upper"
## [13] "archlen_lower" "molar_key_score" "protrusion_index"
## [16] "chief_complaint_score" "result_code"
## [1] "profile"
The Rsample package makes it easy to divide your data up. To view all the functionality navigate to the Rsample vignette.
We will divide the data into training and test samples. This approach is the simplest way to test your model’s accuracy and future performance on unseen data. Here we treat the test data as the unseen data, allowing us to evaluate whether the model is fit to be released into the wild.
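A sketch of the split with rsample. The 75/25 proportion, the seed and the object names are assumptions:

```r
library(rsample)

set.seed(123)  # for reproducibility
strand_split <- initial_split(strand_pat, prop = 0.75, strata = result_code)
train_data <- training(strand_split)
test_data  <- testing(strand_split)
```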
Recipes is an excellent package. For years I have done dummy coding, feature engineering and feature selection with CARET (also a great package), but recipes makes the process much simpler. The first part of the recipe specifies the model formula, and then you add recipe steps; this is supposed to replicate baking, adding the specific ingredients. For all the particular steps that recipes contains, go directly to the recipes site.
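A recipe sketch that would produce the three operations printed in the output, assuming `train_data` from the earlier split:

```r
library(recipes)
library(dplyr)

stranded_rec <- recipe(result_code ~ ., data = train_data) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%  # dummy code nominal predictors
  step_zv(all_predictors()) %>%                   # drop zero variance columns
  step_normalize(all_predictors())                # centre and scale

stranded_rec
```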
## Recipe
##
## Inputs:
##
## role #variables
## outcome 1
## predictor 17
##
## Operations:
##
## Dummy variables from all_nominal(), -all_outcomes()
## Zero variance filter on all_predictors()
## Centering and scaling for all_predictors()
To look up some of these steps, I have previously covered them in a CARET tutorial. For the full list of recipes steps, refer to the link above the code chunk.
The Parsnip package is the modelling interface of TidyModels. Parsnip does not yet support as many algorithms as CARET, but it makes it much simpler to work in the tidy way.
Here we will create a basic logistic regression as our baseline model. If you want a second tutorial around model ensembling in TidyModels with Baguette and Stacks, then I would be happy to arrange this, but these are a session in themselves.
Logistic regression is the choice here because it is a nice generalised linear model that most people have encountered.
TidyModels has a workflow structure which we will build in the next few steps:
In TidyModels you have to create an instance of the model in memory before working with it:
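A sketch of instantiating the specification that produces the printout below (the object name `lr_mod` is an assumption):

```r
library(parsnip)
library(dplyr)

lr_mod <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

lr_mod
```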
## Logistic Regression Model Specification (classification)
##
## Computational engine: glm
The next step is to create the model workflow, connecting the recipe and the newly instantiated model together:
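A workflow sketch, assuming the recipe (`stranded_rec`) and model specification (`lr_mod`) from the earlier steps are in memory:

```r
library(workflows)
library(dplyr)

# Bundle the preprocessing recipe and the model spec into one object
lr_wf <- workflow() %>%
  add_recipe(stranded_rec) %>%
  add_model(lr_mod)

lr_wf
```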
## == Workflow ====================================================================
## Preprocessor: Recipe
## Model: logistic_reg()
##
## -- Preprocessor ----------------------------------------------------------------
## 3 Recipe Steps
##
## * step_dummy()
## * step_zv()
## * step_normalize()
##
## -- Model -----------------------------------------------------------------------
## Logistic Regression Model Specification (classification)
##
## Computational engine: glm
The next step is fitting the model to our data:
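Fitting the whole workflow (preprocessing plus model) is one call; `lr_wf` and `train_data` follow the naming of the earlier sketches:

```r
# Fit the workflow - the recipe is prepped and the glm is estimated in one go
lr_fit <- fit(lr_wf, data = train_data)
```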
The final step is to use the pull_workflow_fit() function to retrieve the fitted model from the workflow:
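A sketch of retrieving and tidying the fit. Note that in newer versions of workflows, `extract_fit_parsnip()` is the equivalent of `pull_workflow_fit()`:

```r
library(broom)
library(dplyr)

# Pull the underlying parsnip fit out of the fitted workflow and
# tidy the coefficient table
lr_fit %>%
  pull_workflow_fit() %>%
  tidy()
```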
## # A tibble: 19 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -1.62 17.5 -0.0927 0.926
## 2 anb_angle -0.0456 0.177 -0.258 0.796
## 3 bjork_sum -0.0923 0.155 -0.596 0.552
## 4 overbite 0.193 0.193 1.00 0.317
## 5 max_ci_to_sn -0.0158 0.181 -0.0872 0.931
## 6 max_ci_to_op 0.121 0.197 0.614 0.539
## 7 impa 0.0207 0.159 0.130 0.896
## 8 mand_ci_to_op -0.210 0.179 -1.17 0.242
## 9 interincisal_angle -0.313 0.166 -1.88 0.0597
## 10 ul_to_eline 0.00217 0.195 0.0111 0.991
## 11 ll_to_eline 0.365 0.192 1.91 0.0567
## 12 nasolabial_angle -0.273 0.156 -1.75 0.0797
## 13 archlen_upper -0.223 0.193 -1.16 0.247
## 14 archlen_lower -0.00651 0.193 -0.0337 0.973
## 15 molar_key_score -0.273 0.177 -1.54 0.123
## 16 protrusion_index 0.0821 0.192 0.428 0.669
## 17 chief_complaint_score 0.353 0.191 1.84 0.0653
## 18 profile_convex 5.24 351. 0.0150 0.988
## 19 profile_straight 5.05 332. 0.0152 0.988
As an optional step, I have created a plot to visualise the significance of each term. This only works for linear and generalised linear models, which report a p value for each coefficient. The visualisation code is below:
## Saving 7 x 5 in image
Now we will assess how well the model predicts on the test (holdout) data to evaluate if we want to productionise the model, or abandon it at this stage. This is implemented below:
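A prediction sketch. The probability column names `.pred_0` / `.pred_1` depend on the factor levels of the outcome, and the renamed output columns are assumptions chosen to mirror the printed table:

```r
# Class predictions and class probabilities on the held-out test data
lr_class <- predict(lr_fit, new_data = test_data, type = "class")
lr_probs <- predict(lr_fit, new_data = test_data, type = "prob")

lr_predictions <- data.frame(
  LR_Class           = lr_class$.pred_class,
  LR_NotStrandedProb = lr_probs$.pred_0,
  LR_StrandedProb    = lr_probs$.pred_1
)

tail(lr_predictions)
```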
## Warning: There are new levels in a factor: staright
## Warning: There are new levels in a factor: staright
## LR_Class LR_NotStrandedProb LR_StrandedProb
## 90 0 0.6975468 0.3024532
## 91 0 0.7581275 0.2418725
## 92 0 0.8939423 0.1060577
## 93 <NA> NA NA
## 94 0 0.7299767 0.2700233
## 95 0 0.6908952 0.3091048
Yardstick is another tool in the TidyModels arsenal. It is useful for generating quick summary statistics and evaluation metrics. I will grab the area under the curve estimates to show how well the model fits:
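A yardstick sketch, assuming `test_data` and the probability predictions `lr_probs` from the earlier steps, and that `result_code` is a factor:

```r
library(yardstick)
library(dplyr)

# Join the truth column onto the probabilities, then compute the AUC;
# the first factor level ("0") is the default event level
roc_results <- bind_cols(test_data["result_code"], lr_probs)
roc_auc(roc_results, truth = result_code, .pred_0)
```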
I like ROC plots, but they only show you sensitivity (how well the model predicts stranded patients) and specificity (how well it predicts those who are not stranded). For binary classification problems I also like to look at the overall accuracy and balanced accuracy from a confusion matrix.
I use the CARET package and utilise the confusion matrix functions to perform this:
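A sketch of the confusion matrix call, with object names carried through from the earlier sketches:

```r
library(caret)

# Compare the predicted classes against the observed outcome
caret::confusionMatrix(
  data      = factor(lr_predictions$LR_Class),
  reference = factor(test_data$result_code)
)
```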
## Warning: package 'caret' was built under R version 4.1.1
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following objects are masked from 'package:yardstick':
##
## precision, recall, sensitivity, specificity
## The following object is masked from 'package:purrr':
##
## lift
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 62 2
## 1 27 2
##
## Accuracy : 0.6882
## 95% CI : (0.5837, 0.7802)
## No Information Rate : 0.957
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.0493
##
## Mcnemar's Test P-Value : 8.324e-06
##
## Sensitivity : 0.69663
## Specificity : 0.50000
## Pos Pred Value : 0.96875
## Neg Pred Value : 0.06897
## Prevalence : 0.95699
## Detection Rate : 0.66667
## Detection Prevalence : 0.68817
## Balanced Accuracy : 0.59831
##
## 'Positive' Class : 0
##
On the back of the Advanced Modelling course I did for the NHS-R Community I have created a package to work with the outputs of a confusion matrix. This package is aimed at the flattening of binary and multi-class confusion matrix results.
As this is a binary classification problem, there is the potential to store the outputs of the model in a database. The ConfusionTableR package can do this for you: in a binary classification model, you would use the binary_class_cm() function to output the results to a list, from which you can then flatten the table. This works very much like the broom package does for linear regression outputs.
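A sketch of flattening the confusion matrix with ConfusionTableR, assuming the prediction objects from the earlier steps:

```r
library(ConfusionTableR)
library(dplyr)

# Flatten the confusion matrix down to a single record-level row
cm <- ConfusionTableR::binary_class_cm(
  factor(lr_predictions$LR_Class),
  factor(test_data$result_code)
)

# The database-ready, flattened output
glimpse(cm$record_level_cm)
```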
## [INFO] Building a record level confusion matrix to store in dataset
## [INFO] Build finished and to expose record level cm use the record_level_cm list item
## Rows: 1
## Columns: 23
## $ Pred_0_Ref_0 <int> 62
## $ Pred_1_Ref_0 <int> 27
## $ Pred_0_Ref_1 <int> 2
## $ Pred_1_Ref_1 <int> 2
## $ Accuracy <dbl> 0.688172
## $ Kappa <dbl> 0.0493479
## $ AccuracyLower <dbl> 0.5837361
## $ AccuracyUpper <dbl> 0.7802377
## $ AccuracyNull <dbl> 0.9569892
## $ AccuracyPValue <dbl> 1
## $ McnemarPValue <dbl> 8.323556e-06
## $ Sensitivity <dbl> 0.6966292
## $ Specificity <dbl> 0.5
## $ Pos.Pred.Value <dbl> 0.96875
## $ Neg.Pred.Value <dbl> 0.06896552
## $ Precision <dbl> 0.96875
## $ Recall <dbl> 0.6966292
## $ F1 <dbl> 0.8104575
## $ Prevalence <dbl> 0.9569892
## $ Detection.Rate <dbl> 0.6666667
## $ Detection.Prevalence <dbl> 0.688172
## $ Balanced.Accuracy <dbl> 0.5983146
## $ cm_ts <dttm> 2021-12-06 11:41:45
The next markdown document will look at how to improve your models with model selection, K-fold cross validation and hyperparameter tuning. I was thinking of doing an ensembling course off the back of this, so please contact me if that would be interesting to you.
I will now save the R image data into file, as we will pick this up in the next markdown document.
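Saving the session is one call; the file name here is an assumption:

```r
# Save all session objects so we can pick them up in the next document
save.image(file = "tidymodels_part_one.RData")
```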
The first markdown document showed you how to build your first TidyModels model on a healthcare dataset. This could be an ML model you simply tweak for your own uses. I will now load the data back in and resume where we left off:
The first step will involve something called cross validation (see supporting workshop slides). The essence of cross validation is that you take sub samples of the training dataset. This is done to emulate how well the model will perform on unseen data samples when out in the wild (production):
As the image shows, the folds take a sample of the training set and each randomly selected fold acts as the test sample. We then use a final hold-out validation set to test the model. This will be shown in the following section.
We will now fit the previously trained logistic regression workflow to the resamples, to get a more robust estimate of its performance under cross validation:
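A sketch of building the folds and refitting across them; the seed and the object names (including `lr_wf` from part one) are assumptions:

```r
library(rsample)
library(tune)

# Ten-fold cross validation on the training data
set.seed(123)
ten_fold <- vfold_cv(train_data, v = 10)

# Refit the logistic regression workflow on each fold
lr_resampled <- fit_resamples(lr_wf, resamples = ten_fold)
```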
## Warning: package 'rlang' was built under R version 4.1.1
## Warning: package 'vctrs' was built under R version 4.1.1
We will now collect the metrics using the tune package and the collect_metrics function:
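Assuming the resample results object from the earlier sketch, pooling the metrics is one call:

```r
library(tune)

# Mean and standard error of each metric across the ten folds
collect_metrics(lr_resampled)
```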
## # A tibble: 2 x 6
## .metric .estimator mean n std_err .config
## <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 accuracy binary 0.730 10 0.0194 Preprocessor1_Model1
## 2 roc_auc binary 0.583 10 0.0569 Preprocessor1_Model1
## The true accuracy of the model is between the resample testing:72.98
## The validation sample: .NULL
This shows that the true accuracy value is somewhere between the reported results from the resampling method and those in our validation sample.
The following example will move on from the logistic regression and aim to build a random forest, and later a decision tree. Other options in Parsnip would be to use a gradient boosted tree to amp up the results further. In addition, I aim to teach a follow-up webinar on ensembling, specifically model stacking (Stacks package) and bagging (Baguette package).
The first step, as with the logistic regression example, is to define and instantiate the model:
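A sketch of the specification behind the printout below (the object name is an assumption):

```r
library(parsnip)
library(dplyr)

rf_mod <- rand_forest(trees = 500) %>%
  set_engine("ranger") %>%
  set_mode("classification")

rf_mod
```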
## Random Forest Model Specification (classification)
##
## Main Arguments:
## trees = 500
##
## Computational engine: ranger
Then we are going to fit the model to the previous training data:
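Fitting the specification directly with the formula interface is a sketch like this, assuming `train_data` from part one:

```r
library(dplyr)

rf_fit <- rf_mod %>%
  fit(result_code ~ ., data = train_data)

rf_fit
```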
## parsnip model object
##
## Fit time: 200ms
## Ranger result
##
## Call:
## ranger::ranger(x = maybe_data_frame(x), y = y, num.trees = ~500, num.threads = 1, verbose = FALSE, seed = sample.int(10^5, 1), probability = TRUE)
##
## Type: Probability estimation
## Number of trees: 500
## Sample size: 285
## Number of independent variables: 17
## Mtry: 4
## Target node size: 10
## Variable importance mode: none
## Splitrule: gini
## OOB prediction error (Brier s.): 0.18743
We will aim to get a more robust performance estimate for this model by fitting it to a resamples object, using parsnip and rsample:
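One way to do this, assuming the recipe, model spec and folds from the earlier sketches:

```r
library(workflows)
library(tune)
library(dplyr)

# Bundle the random forest with the earlier recipe and fit across the folds
rf_wf <- workflow() %>%
  add_recipe(stranded_rec) %>%
  add_model(rf_mod)

set.seed(123)
rf_resampled <- fit_resamples(rf_wf, resamples = ten_fold)
rf_resampled
```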
## Warning: package 'ranger' was built under R version 4.1.1
## # Resampling results
## # 10-fold cross-validation
## # A tibble: 10 x 4
## splits id .metrics .notes
## <list> <chr> <list> <list>
## 1 <split [256/29]> Fold01 <tibble [2 x 4]> <tibble [0 x 1]>
## 2 <split [256/29]> Fold02 <tibble [2 x 4]> <tibble [0 x 1]>
## 3 <split [256/29]> Fold03 <tibble [2 x 4]> <tibble [0 x 1]>
## 4 <split [256/29]> Fold04 <tibble [2 x 4]> <tibble [0 x 1]>
## 5 <split [256/29]> Fold05 <tibble [2 x 4]> <tibble [0 x 1]>
## 6 <split [257/28]> Fold06 <tibble [2 x 4]> <tibble [0 x 1]>
## 7 <split [257/28]> Fold07 <tibble [2 x 4]> <tibble [0 x 1]>
## 8 <split [257/28]> Fold08 <tibble [2 x 4]> <tibble [0 x 1]>
## 9 <split [257/28]> Fold09 <tibble [2 x 4]> <tibble [0 x 1]>
## 10 <split [257/28]> Fold10 <tibble [2 x 4]> <tibble [0 x 1]>
The next step is to collect the resample metrics:
## # A tibble: 2 x 6
## .metric .estimator mean n std_err .config
## <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 accuracy binary 0.754 10 0.0276 Preprocessor1_Model1
## 2 roc_auc binary 0.559 10 0.0361 Preprocessor1_Model1
The model’s predictive power is maxing out at around 75%. I know this is because the data is dummy data and most of the features contained in the model have only a weak association with the outcome variable.
What you would need to do after this is look for more representative features of what causes a patient to stay a long time in hospital. This is where the clinical context comes into play.
We are going to now create a decision tree and we are going to tune the hyperparameters using the dials package. The dials package contains a list of hyperparameter tuning methods and is useful for creating quick hyperparameter grids and aiming to optimise them.
Like all the other steps, the first thing to do is build the decision tree. Note: the reason for set_mode(“classification”) is that the thing we are predicting is a factor. If it were a continuous variable, you would need to switch this to regression. Otherwise, model development for regression is identical to classification.
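A sketch of the tunable specification shown below; `tune()` leaves placeholders for the grid search to fill in:

```r
library(parsnip)
library(tune)
library(dplyr)

dt_mod <- decision_tree(
  cost_complexity = tune(),  # placeholders filled in by the grid search
  tree_depth      = tune()
) %>%
  set_engine("rpart") %>%
  set_mode("classification")

dt_mod
```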
## Decision Tree Model Specification (classification)
##
## Main Arguments:
## cost_complexity = tune()
## tree_depth = tune()
##
## Computational engine: rpart
The next step is to fill in these blank values for cost complexity and tree depth. See the parsnip documentation for what these mean: decision trees have a cost complexity value which penalises additional splits, and the tree depth controls how far down the tree can grow.
We will now create the object:
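A dials sketch of a regular grid over the two parameters. The exact levels used to produce the grid below are an assumption:

```r
library(dials)

tree_grid <- grid_regular(
  cost_complexity(),   # log-scaled penalty, 1e-10 to 0.1 by default
  tree_depth(),        # integer depth of the tree
  levels = c(10, 5)    # levels per parameter - an assumption here
)

tree_grid
```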
## # A tibble: 20 x 2
## cost_complexity tree_depth
## <dbl> <int>
## 1 0.0000000001 1
## 2 0.000000001 1
## 3 0.00000001 1
## 4 0.0000001 1
## 5 0.000001 1
## 6 0.00001 1
## 7 0.0001 1
## 8 0.001 1
## 9 0.01 1
## 10 0.1 1
## 11 0.0000000001 2
## 12 0.000000001 2
## 13 0.00000001 2
## 14 0.0000001 2
## 15 0.000001 2
## 16 0.00001 2
## 17 0.0001 2
## 18 0.001 2
## 19 0.01 2
## 20 0.1 2
The tuning and modelling process normally needs the ML engineer to use the full potential of the machine. The next steps show how to register the cores on your machine and use them for training the model and grid searching:
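A sketch of registering a parallel backend with doParallel; leaving one core free is a common convention, not a requirement:

```r
library(doParallel)

# Use all but one core, leaving one free for the operating system
all_cores <- parallel::detectCores() - 1
print(all_cores)

cl <- parallel::makeCluster(all_cores)
doParallel::registerDoParallel(cl)
print(cl)

# When the grid search has finished: parallel::stopCluster(cl)
```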
## [1] 3
## socket cluster with 3 nodes on host 'localhost'
Next, I will create the model workflow, as we have done a few times before:
This ggplot helps to visualise how the tuning has gone, and shows where the best tree depth occurs for each value of the cost complexity (the penalty applied to additional splits):
## Saving 7 x 5 in image
This shows that you only need a depth of 4 to get the optimal accuracy. However, the tune package helps us out with this as well.
The tune package allows us to select the best candidate model, with the most optimal set of hyperparameters:
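A sketch of ranking and selecting the candidates; `dt_tuned` is assumed to be the result of `tune_grid()` on the decision tree workflow:

```r
library(tune)

# Rank the candidates by area under the ROC curve, then keep the best
show_best(dt_tuned, metric = "roc_auc")
best_tree <- select_best(dt_tuned, metric = "roc_auc")
best_tree
```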
## # A tibble: 5 x 8
## cost_complexity tree_depth .metric .estimator mean n std_err .config
## <dbl> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 0.0000000001 4 roc_auc binary 0.545 10 0.0542 Preprocesso~
## 2 0.000000001 4 roc_auc binary 0.545 10 0.0542 Preprocesso~
## 3 0.00000001 4 roc_auc binary 0.545 10 0.0542 Preprocesso~
## 4 0.0000001 4 roc_auc binary 0.545 10 0.0542 Preprocesso~
## 5 0.000001 4 roc_auc binary 0.545 10 0.0542 Preprocesso~
## # A tibble: 1 x 3
## cost_complexity tree_depth .config
## <dbl> <int> <chr>
## 1 0.0000000001 4 Preprocessor1_Model021
The next step is to use the best tree to make our predictions.
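A sketch of locking the winning hyperparameters into the workflow and refitting; `dt_wf` and `best_tree` follow the naming of the earlier sketches:

```r
library(tune)

# Inject the winning hyperparameters into the tree workflow
final_tree_wf <- finalize_workflow(dt_wf, best_tree)
final_tree_wf

# Refit the finalised workflow on the training data
final_tree_fit <- fit(final_tree_wf, data = train_data)
```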
## == Workflow ====================================================================
## Preprocessor: Formula
## Model: decision_tree()
##
## -- Preprocessor ----------------------------------------------------------------
## result_code ~ .
##
## -- Model -----------------------------------------------------------------------
## Decision Tree Model Specification (classification)
##
## Main Arguments:
## cost_complexity = 1e-10
## tree_depth = 4
##
## Computational engine: rpart
Make a prediction against this finalised tree:
## == Workflow [trained] ==========================================================
## Preprocessor: Formula
## Model: decision_tree()
##
## -- Preprocessor ----------------------------------------------------------------
## result_code ~ .
##
## -- Model -----------------------------------------------------------------------
## n= 285
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 285 65 0 (0.7719298 0.2280702)
## 2) ll_to_eline< 3.5 202 36 0 (0.8217822 0.1782178)
## 4) archlen_upper>=-5.5 178 27 0 (0.8483146 0.1516854) *
## 5) archlen_upper< -5.5 24 9 0 (0.6250000 0.3750000)
## 10) interincisal_angle>=110.5 16 3 0 (0.8125000 0.1875000) *
## 11) interincisal_angle< 110.5 8 2 1 (0.2500000 0.7500000) *
## 3) ll_to_eline>=3.5 83 29 0 (0.6506024 0.3493976)
## 6) nasolabial_angle>=84.5 68 20 0 (0.7058824 0.2941176)
## 12) interincisal_angle>=121.5 13 0 0 (1.0000000 0.0000000) *
## 13) interincisal_angle< 121.5 55 20 0 (0.6363636 0.3636364)
## 26) impa>=95.5 48 14 0 (0.7083333 0.2916667) *
## 27) impa< 95.5 7 1 1 (0.1428571 0.8571429) *
## 7) nasolabial_angle< 84.5 15 6 1 (0.4000000 0.6000000) *
We will look at global variable importance. As mentioned previously, to look at local, patient-level importance, use the LIME package.
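One way to produce the importance plot is with the vip package (a sketch, assuming the fitted workflow from the earlier steps):

```r
library(vip)
library(dplyr)

# Model-based global importance from the fitted rpart tree
final_tree_fit %>%
  pull_workflow_fit() %>%
  vip::vip()
```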
## Saving 7 x 5 in image
We anticipated this when we looked at the logistic regression output: these were the variables with the strongest linear significance.
The last step is to create the final predictions from the tuned decision tree:
## Warning: All models failed. See the `.notes` column.
## NULL
## # A tibble: 0 x 2
## # ... with 2 variables: id <chr>, .predictions <???>
You could do similar with viewing this object in the confusion matrix add in, but I will view this on a plot:
One last point to note - to inspect any of the tuning parameters and hyperparameters for the models you can use the args function to return these - examples below:
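The output below comes from calls like these, with parsnip loaded:

```r
library(parsnip)

# Inspect the available arguments for each model specification
args(decision_tree)
args(logistic_reg)
args(rand_forest)
```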
## function (mode = "unknown", engine = "rpart", cost_complexity = NULL,
## tree_depth = NULL, min_n = NULL)
## NULL
## function (mode = "classification", engine = "glm", penalty = NULL,
## mixture = NULL)
## NULL
## function (mode = "unknown", engine = "ranger", mtry = NULL, trees = NULL,
## min_n = NULL)
## NULL
Finally, we will look at implementing a really powerful model that I use a lot in Kaggle competitions. Last month I finished in the top 6% of Kaggle entrants, just using this model and some preprocessing.
The logic of XGBoost is awesome. Essentially it performs multiple training iterations, and at each iteration a new tree is fitted to the errors of the ensemble so far. To visualise this diagrammatically:
This is how the model learns, but it can easily overfit, which is why hyperparameter tuning is needed.
Let’s learn how to train one of these Kaggle-beaters!
From the recipe we created earlier, which applied the preprocessing steps to the stranded data, we are going to try to improve our model using tuning, resampling and powerful model training techniques, such as bagged trees and gradient boosting.
The next step, as we covered earlier, is to create a grid search in dials to go over each one of these hyperparameters to tune the model.
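A space-filling grid sketch over the four boosting hyperparameters. Whether the original grid was a latin hypercube or a random grid is an assumption; the seed is also an assumption:

```r
library(dials)

set.seed(123)
xgb_grid <- grid_latin_hypercube(
  min_n(),
  tree_depth(),
  learn_rate(),
  loss_reduction(),
  size = 100    # 100 candidate combinations, as in the grid below
)

xgb_grid
```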
## # A tibble: 100 x 4
## min_n tree_depth learn_rate loss_reduction
## <int> <int> <dbl> <dbl>
## 1 20 11 5.71e- 6 0.202
## 2 18 8 1.02e- 5 0.0000000482
## 3 16 6 6.89e- 8 0.00000688
## 4 3 6 1.40e- 2 0.000232
## 5 15 14 1.18e- 4 0.00000000120
## 6 31 14 8.35e- 6 0.00686
## 7 40 13 3.11e- 7 0.00120
## 8 2 14 1.18e- 5 6.60
## 9 5 8 1.09e- 7 0.00000000945
## 10 4 7 2.23e-10 2.00
## # ... with 90 more rows
The next step is to set up the workflow for the grid.
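A sketch of the boosted tree specification and workflow, reusing the recipe from part one (object names are assumptions):

```r
library(parsnip)
library(workflows)
library(tune)
library(dplyr)

xgb_mod <- boost_tree(
  trees          = 500,
  min_n          = tune(),
  tree_depth     = tune(),
  learn_rate     = tune(),
  loss_reduction = tune()
) %>%
  set_engine("xgboost") %>%
  set_mode("classification")

xgb_wf <- workflow() %>%
  add_recipe(stranded_rec) %>%
  add_model(xgb_mod)
```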
The next step is to now tune the model using your tuning grid:
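A tuning sketch, assuming the workflow, folds and grid from the earlier sketches; this step is the slow one and benefits from the parallel backend registered earlier:

```r
library(tune)

set.seed(123)
xgb_tuned <- tune_grid(
  xgb_wf,
  resamples = ten_fold,
  grid      = xgb_grid,
  metrics   = yardstick::metric_set(roc_auc, accuracy)
)
```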
Let’s now get the best hyperparameters for the model:
## # A tibble: 5 x 10
## min_n tree_depth learn_rate loss_reduction .metric .estimator mean n
## <int> <int> <dbl> <dbl> <chr> <chr> <dbl> <int>
## 1 3 4 2.30e- 8 0.000122 roc_auc binary 0.612 10
## 2 14 11 1.23e- 2 0.0444 roc_auc binary 0.571 10
## 3 12 6 3.22e- 2 0.0000000154 roc_auc binary 0.568 10
## 4 6 14 1.41e-10 0.0000000791 roc_auc binary 0.566 10
## 5 3 6 1.40e- 2 0.000232 roc_auc binary 0.563 10
## # ... with 2 more variables: std_err <dbl>, .config <chr>
Now let’s select the best-performing set of hyperparameters from the model:
The final stage is finalizing the model to use the best parameters:
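A sketch of finalising the specification with the winning candidate, carrying the object names through from the earlier sketches:

```r
library(tune)

best_xgb <- select_best(xgb_tuned, metric = "roc_auc")

# Replace the tune() placeholders with the winning values
final_xgb <- finalize_model(xgb_mod, best_xgb)
final_xgb
```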
## Boosted Tree Model Specification (classification)
##
## Main Arguments:
## trees = 500
## min_n = 3
## tree_depth = 4
## learn_rate = 2.3033499372063e-08
## loss_reduction = 0.000122357325553992
##
## Computational engine: xgboost
## [11:48:41] WARNING: amalgamation/../src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
## Warning: There are new levels in a factor: staright
## [11:48:42] WARNING: amalgamation/../src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
## # A tibble: 2 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 accuracy binary 0.642
## 2 kap binary 0.00185
As the final step, we will use ConfusionTableR to store the classification results in a csv file:
## Warning: package 'data.table' was built under R version 4.1.1
##
## Attaching package: 'data.table'
## The following object is masked from 'package:rlang':
##
## :=
## The following object is masked from 'package:purrr':
##
## transpose
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## [INFO] Building a record level confusion matrix to store in dataset
## [INFO] Build finished and to expose record level cm use the record_level_cm list item
## $confusion_matrix
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 57 25
## 1 9 4
##
## Accuracy : 0.6421
## 95% CI : (0.5372, 0.7379)
## No Information Rate : 0.6947
## P-Value [Acc > NIR] : 0.8886
##
## Kappa : 0.0019
##
## Mcnemar's Test P-Value : 0.0101
##
## Sensitivity : 0.8636
## Specificity : 0.1379
## Pos Pred Value : 0.6951
## Neg Pred Value : 0.3077
## Prevalence : 0.6947
## Detection Rate : 0.6000
## Detection Prevalence : 0.8632
## Balanced Accuracy : 0.5008
##
## 'Positive' Class : 0
##
##
## $record_level_cm
## Pred_0_Ref_0 Pred_1_Ref_0 Pred_0_Ref_1 Pred_1_Ref_1 Accuracy Kappa
## 1 57 9 25 4 0.6421053 0.001854141
## AccuracyLower AccuracyUpper AccuracyNull AccuracyPValue McnemarPValue
## 1 0.5372025 0.7378857 0.6947368 0.8886229 0.01009731
## Sensitivity Specificity Pos.Pred.Value Neg.Pred.Value Precision Recall
## 1 0.8636364 0.137931 0.695122 0.3076923 0.695122 0.8636364
## F1 Prevalence Detection.Rate Detection.Prevalence Balanced.Accuracy
## 1 0.7702703 0.6947368 0.6 0.8631579 0.5007837
## cm_ts
## 1 2021-12-06 11:48:42
##
## $cm_tbl
## PredLabel Freq
## 1 Pred_0_Ref_0 57
## 2 Pred_1_Ref_0 9
## 3 Pred_0_Ref_1 25
## 4 Pred_1_Ref_1 4
##
## $last_run
## [1] "2021-12-06 11:48:42 IST"
## Pred_0_Ref_0 Pred_1_Ref_0 Pred_0_Ref_1 Pred_1_Ref_1 Accuracy Kappa
## 1 57 9 25 4 0.6421053 0.001854141
## AccuracyLower AccuracyUpper AccuracyNull AccuracyPValue McnemarPValue
## 1 0.5372025 0.7378857 0.6947368 0.8886229 0.01009731
## Sensitivity Specificity Pos.Pred.Value Neg.Pred.Value Precision Recall
## 1 0.8636364 0.137931 0.695122 0.3076923 0.695122 0.8636364
## F1 Prevalence Detection.Rate Detection.Prevalence Balanced.Accuracy
## 1 0.7702703 0.6947368 0.6 0.8631579 0.5007837
## cm_ts
## 1 2021-12-06 11:48:42
## cm_outputs.csv
That wraps up this part of the demo. The XGBoost model has shifted performance in the wrong direction: it picks up more of the majority (not stranded) class, but overall accuracy and balanced accuracy have actually dropped.
If you are interested in a further session on ensembling, I would be happy to go over the Stacks and Baguette packages for model stacking and bagging. These are relatively new additions to TidyModels and are not yet as optimised as their caret equivalents, but I can show you how they are implemented.